2023

Genome Index Building

Launch interactive job (ijob)

ijob -c 8 --mem-per-cpu=32000 -p standard --time=03:00:00

Genome Index Building
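
The notes do not record the exact command used for this step. As a minimal sketch, assuming the index is built with STAR (the aligner used later in these notes), the call could look like the following. The function name, file names, and the --sjdbOverhang value are illustrative assumptions, not the original settings.

```shell
#!/usr/bin/env bash
# Sketch only: build a STAR genome index from a FASTA file and a GTF annotation.
# build_star_index and all paths are hypothetical names for illustration.

build_star_index() {
    local fasta=$1 gtf=$2 index_dir=$3 threads=${4:-8}
    mkdir -p "$index_dir"
    STAR --runMode genomeGenerate \
         --runThreadN "$threads" \
         --genomeDir "$index_dir" \
         --genomeFastaFiles "$fasta" \
         --sjdbGTFfile "$gtf" \
         --sjdbOverhang 100   # ideally read length minus 1; 100 is the STAR default
}

# Example call inside the interactive job (hypothetical file names):
# build_star_index genome.fa annotation.gtf star_index 8
```

The thread count is kept as a parameter so that it can match the -c value requested from ijob.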

Message during the process

Results inside the genome index folder

FastQC


FastQC checks the quality of the raw RNA-seq read files. Read quality from recent sequencing runs appears to be much better than it was when FastQC was developed.


MultiQC reads all the report files in a folder, including its subfolders, and produces a single HTML report file.
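
A minimal sketch of the two QC steps run together, assuming both tools are on the PATH and the raw reads are gzipped FASTQ files. The function name and directory names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: run FastQC on every FASTQ file, then summarize with MultiQC.
# run_qc and the directory layout are assumptions for illustration.

run_qc() {
    local fastq_dir=$1 out_dir=$2 threads=${3:-8}
    mkdir -p "$out_dir"
    # -t: number of files processed in parallel, -o: directory for the reports
    fastqc -t "$threads" -o "$out_dir" "$fastq_dir"/*.fastq.gz
    # MultiQC scans the report directory and writes one combined HTML report
    multiqc "$out_dir" -o "$out_dir/multiqc"
}

# Example call (hypothetical directories):
# run_qc raw_fastq qc_reports 8
```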

Trimmomatic


Trimmomatic takes paired-end read files (mode: PE; file names ending in _1 and _2) and writes reads that can no longer be paired to separate output files.

A for-loop that executes Trimmomatic on each pair of FASTQ files runs them serially. Instead, consider a for-loop that writes one Slurm job submission file per sample and then submits all of them, so that the samples are processed in parallel.
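
The job-file approach could be sketched as follows. This is an illustration under assumptions: the function names, the TRIMMOMATIC_JAR variable, the resource requests, and the trimming steps (SLIDINGWINDOW:4:15 MINLEN:36) are placeholders rather than the settings actually used, and an #SBATCH --account line may also be needed on your cluster.

```shell
#!/usr/bin/env bash
# Sketch only: write one Slurm job file per sample for Trimmomatic in PE mode.
# write_trim_job, make_trim_jobs and TRIMMOMATIC_JAR are hypothetical names.

write_trim_job() {
    local sample=$1 fastq_dir=$2 out_dir=$3
    local job_file="trim_${sample}.slurm"
    # Unquoted heredoc: ${...} is filled in now, \$ and \\ stay literal in the job file
    cat > "$job_file" <<EOF
#!/bin/bash
#SBATCH --job-name=trim_${sample}
#SBATCH --partition=standard
#SBATCH --cpus-per-task=8
#SBATCH --mem=16G
#SBATCH --time=03:00:00

mkdir -p ${out_dir}
java -jar "\$TRIMMOMATIC_JAR" PE -threads 8 \\
    ${fastq_dir}/${sample}_1.fastq.gz ${fastq_dir}/${sample}_2.fastq.gz \\
    ${out_dir}/${sample}_1.paired.fastq.gz ${out_dir}/${sample}_1.unpaired.fastq.gz \\
    ${out_dir}/${sample}_2.paired.fastq.gz ${out_dir}/${sample}_2.unpaired.fastq.gz \\
    SLIDINGWINDOW:4:15 MINLEN:36
EOF
    echo "$job_file"
}

make_trim_jobs() {
    local fastq_dir=$1 out_dir=$2 r1 sample
    for r1 in "$fastq_dir"/*_1.fastq.gz; do
        [ -e "$r1" ] || continue              # skip if the glob matched nothing
        sample=$(basename "$r1" _1.fastq.gz)  # e.g. raw/S1_1.fastq.gz -> S1
        write_trim_job "$sample" "$fastq_dir" "$out_dir"
    done
}

# Generate and submit (hypothetical directories):
# for job in $(make_trim_jobs raw_fastq trimmed); do sbatch "$job"; done
```

Writing the files first and submitting in a second loop makes it easy to inspect one job file before queuing all of them.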

Alignment with STAR

The trimmed reads from Trimmomatic go to STAR for alignment to a reference genome. The output is a set of BAM files.

As with Trimmomatic, a for-loop that executes STAR on each sample runs them serially. Instead, consider a for-loop that writes one Slurm job submission file per sample and then submits all of them.
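
For the alignment step, a job script per sample could be generated along these lines. This is a sketch under assumptions: the function name, directory names, input file naming (*_1.paired.fastq.gz and *_2.paired.fastq.gz), and resource requests are hypothetical, and the STAR options shown are a common minimal set rather than the ones actually used.

```shell
#!/usr/bin/env bash
# Sketch only: print a Slurm job script that aligns one paired-end sample with STAR.
# star_job_script and all directory names are hypothetical.

star_job_script() {
    local sample=$1 trimmed_dir=$2 index_dir=$3 bam_dir=$4
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=star_${sample}
#SBATCH --partition=standard
#SBATCH --cpus-per-task=8
#SBATCH --mem=64G
#SBATCH --time=03:00:00

mkdir -p ${bam_dir}
STAR --runThreadN 8 \\
     --genomeDir ${index_dir} \\
     --readFilesIn ${trimmed_dir}/${sample}_1.paired.fastq.gz ${trimmed_dir}/${sample}_2.paired.fastq.gz \\
     --readFilesCommand zcat \\
     --outSAMtype BAM SortedByCoordinate \\
     --outFileNamePrefix ${bam_dir}/${sample}_
EOF
}

# Write and submit one job per sample (hypothetical directories):
# for r1 in trimmed/*_1.paired.fastq.gz; do
#     sample=$(basename "$r1" _1.paired.fastq.gz)
#     star_job_script "$sample" trimmed star_index bam > "star_${sample}.slurm"
#     sbatch "star_${sample}.slurm"
# done
```

With these options STAR names each output file with the given prefix followed by Aligned.sortedByCoord.out.bam.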

featureCounts

featureCounts counts the reads that STAR mapped to each annotated feature. Rivanna does not currently have featureCounts installed, so download a Docker container image and run it from there.

A useful Rivanna help page gives these commands:

module load singularity
singularity pull docker://account/image

featureCounts takes all the BAM files and produces a single tab-delimited output file in txt format.
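
A minimal sketch of running featureCounts from the pulled image, assuming the image provides a featureCounts executable on its PATH. The function name, image file name, and file paths are hypothetical, and the options are a typical paired-end set rather than the ones actually used.

```shell
#!/usr/bin/env bash
# Sketch only: count reads per gene across all BAM files inside a Singularity image.
# count_features and the file names in the example call are hypothetical.

count_features() {
    local image=$1 gtf=$2 out=$3
    shift 3   # the remaining arguments are the BAM files
    # -T: threads, -p: input is paired-end, -a: annotation, -o: output table
    # Newer Subread releases (2.0.2 and later) also need --countReadPairs
    # to count fragments instead of individual reads.
    singularity exec "$image" featureCounts -T 8 -p -a "$gtf" -o "$out" "$@"
}

# Example call (hypothetical file names):
# count_features image.sif annotation.gtf counts.txt bam/*.bam
```

Passing every BAM file in one call is what yields the single combined count table, with one column per BAM file.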

Taking the results to R

Read the txt file that contains all the count data into a data frame. The column names can be a bit messy, so trim them down to sample names. The resulting table can be fed to a package such as DESeq2.
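
The notes do this trimming in R. As an alternative sketch, the header can also be cleaned in the shell before the file is loaded. It assumes the featureCounts layout of a leading comment line, six annotation columns (Geneid, Chr, Start, End, Strand, Length), then one column per BAM file path. The function name and the default suffix are assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: reduce BAM-path column names in a featureCounts table to sample names.
# clean_count_header and the default suffix are hypothetical.

clean_count_header() {
    local counts=$1 suffix=${2:-_Aligned.sortedByCoord.out.bam}
    awk -v suffix="$suffix" 'BEGIN { FS = OFS = "\t" }
        /^#/ { next }                      # drop the leading comment line
        !seen_header {
            for (i = 7; i <= NF; i++) {    # columns 1 to 6 are annotation fields
                sub(/.*\//, "", $i)         # strip the directory part
                n = length($i) - length(suffix)
                if (n > 0 && substr($i, n + 1) == suffix)
                    $i = substr($i, 1, n)  # strip the file name suffix
            }
            seen_header = 1
        }
        { print }' "$counts"
}

# Example call (hypothetical file names):
# clean_count_header counts.txt > counts_clean.txt
```

The cleaned file can then be read as a tab-delimited table, with the gene identifiers in the first column.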

Volcano plot, heatmap, comparison condition

EnrichGO

MSigDB H